Code and Lab 6


Authors
Affiliation

Cameron McLaughlin

The University

Alex Smilor

The University

Lab 7 Project Kickoff

Github: CamGit12/ESS330ProjectProposals

Title: Elwha River - Glines Canyon Dam Removal Effects; Turbidity and Discharge

Introduction, Background, Motivation

Dams have long been built by humans to protect development from flooding, to divert water for crop irrigation, or, in modern times, to generate power for the growing energy demands of an industrialized world (Ryan Bellmore et al., 2017). In the United States there are estimated to be over 2 million dams, though many of these are small and aging (Ryan Bellmore et al., 2017). Many US dams are now over 50 years old, and this age, combined with shifts in scientific knowledge and societal values, has made removal an increasingly popular option for aging structures (Duda et al., 2008; Ryan Bellmore et al., 2017). While dam removal is often done for economic reasons, since the cost of repairing dams can be prohibitively expensive, it has also recently gained popularity as a valuable method for restoring riverine ecosystems (Duda et al., 2008; Ryan Bellmore et al., 2017). For these reasons and more, over 1,000 dams have been removed in the US in recent decades, though the effects of these removals on riverine ecosystems largely go unstudied (Ryan Bellmore et al., 2017).

Though river restoration is often one of the goals of dam removal, especially in recent years, the consequences of removal are nuanced, with both benefits and costs that must be weighed before and after removal efforts (Duda et al., 2008). Dam removal is known to affect the physical and biological properties of river systems, with potential consequences for river health, human health, local economies, and biodiversity. Removing dams can be an important step in reconnecting riparian habitats, allowing wetlands to flood and creating a more complex, better connected riparian habitat matrix that benefits species such as wide-ranging terrestrial animals and species reliant on ephemeral water bodies (Bednarek, 2001). Dam removal can also improve the connectivity of aquatic habitats, allowing fish and aquatic invertebrates to travel freely up- and downstream, often benefiting migratory fish such as salmon (Bednarek, 2001). Physical properties, such as temperature, are also often affected. Dams frequently decrease water temperature downstream of the dam, since many draw water from the cooler depths of their reservoirs (Bednarek, 2001); removal is therefore often an effective way to restore river water temperatures to their natural state. Additionally, sediment transport often changes following dam removal as previously trapped sediment is released. How sediment transport changes is highly variable, however: some streams see increased sediment mobilization immediately following removal, while others see little mobilization until months afterward (Johnson, 2001).

This project will focus on sediment flows following one of the largest dam removal projects in the United States. The Elwha River, located in northern Washington in Olympic National Park upstream of the Strait of Juan de Fuca, saw the removal of two large dams between 2011 and 2012, at the time the largest dam removal and river restoration project in US history (Duda et al., 2011). As part of these removals, and to aid in assessing river restoration, extensive monitoring of flow conditions and water quality metrics was undertaken, with the goal of better understanding how dam removal at this scale affects watershed health. As a result, an unusually large body of research was produced around this project, providing a rare opportunity to study the ongoing effects of such a monumental undertaking (Randle et al., 2015). Many researchers have focused on the geomorphic responses to these dam removals (Mussman et al., 2008), but the need remains for continued analysis of ongoing changes and a better understanding of the dynamic water quality impacts of removal.

This project will contribute to the continued study of the dual dam removal on the Elwha River and the downstream water quality dynamics that followed, specifically by examining the relationship between upstream peak daily discharge and downstream daily turbidity. Turbidity, the concentration of suspended particles in the water, is important to investigate because of the high potential for increased sediment flow following dam removal, which can degrade water quality further downstream as massive amounts of sediment are released from behind the dams, potentially carrying harmful pollutants with them (Hart et al., 2002). In the case of the Elwha River dams, an estimated 19 million m³ of sediment was trapped behind the two dams combined, making the potential water quality impacts high (Duda et al., 2011). Beyond water quality, turbidity can also harm biota, particularly migratory fish species: turbidity is negatively associated with salmon swimming speed (Lehman et al., 2017). Given that reconnecting salmon and trout runs was one of the key goals of the removal project, understanding these potential challenges is important both for achieving project goals and for learning how such effects can best be managed (Duda et al., 2011).

This project will draw from two data sources: an upstream river discharge USGS gauge (Elwha River at Mcdonald BR Near Port Angeles, WA, 12045500) and a downstream river water quality USGS gauge (Elwha River at Diversion Near Port Angeles, WA, 12046260). The two variables we will test against each other are daily peak discharge and daily peak turbidity. Because the dataset is large, we will first use a Kolmogorov-Smirnov test to assess normality (Mishra et al., 2019); since the data are not normally distributed, the relationship itself will then be tested with a non-parametric correlation.
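The daily-peak variables described above can be derived from the instantaneous gauge records with a simple group-and-summarise step. The sketch below uses an invented toy record standing in for a readNWISuv() result; the column names mirror our cleaned data, and the values are synthetic.

```r
#Hypothetical sketch: deriving daily peaks from an instantaneous record
library(dplyr)
library(lubridate)

#toy 15-minute-style record collapsed to hourly for brevity (synthetic values)
inst <- tibble(
  dateTime      = as.POSIXct("2013-09-18 00:00", tz = "UTC") + hours(0:47),
  discharge_cfs = c(rep(600, 24), rep(900, 24)) + rep(c(0, 50), 24)
)

daily_peaks <- inst %>%
  mutate(Date = date(dateTime)) %>%
  group_by(Date) %>%
  summarise(peak_discharge_cfs = max(discharge_cfs), .groups = "drop")

daily_peaks  #one row per day, holding that day's maximum discharge
```

The same pattern would apply to the turbidity gauge, after which the two daily-peak tables could be joined by Date.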

EDA

Project Libraries

In [1]:
Code
#Libraries (package startup and masking messages suppressed for readability)
suppressPackageStartupMessages({
  library(dplyr)
  library(lubridate)
  library(dataRetrieval)
  library(tidyr)
  library(ggplot2)
  library(readr)
  library(ggpubr)
  library(tidyverse)
  library(plotly)
  library(tsibble)
  library(forecast)
  library(feasts)
  library(modeltime)
  library(timeSeries)
  library(timetk)
  library(tidymodels)
  library(earth)
})

Clean and Prep Data

In [2]:
Code
#Clean and Prep
raw_turbidity <- readNWISuv("12046260", "63682", startDate = "2013-09-18", endDate = "2019-12-19", tz = "UTC")
GET:https://nwis.waterservices.usgs.gov/nwis/iv/?site=12046260&format=waterml%2C1.1&ParameterCd=63682&startDT=2013-09-18&endDT=2019-12-19
Code
raw_discharge <- readNWISuv("12045500", "00060", startDate = "2013-09-18", endDate = "2019-12-19", tz = "UTC")
GET:https://nwis.waterservices.usgs.gov/nwis/iv/?site=12045500&format=waterml%2C1.1&ParameterCd=00060&startDT=2013-09-18&endDT=2019-12-19
Code
#cleaned

turbidity_clean <- raw_turbidity %>%
  rename(turbidity_fbu = X_63682_00000) %>% 
  select(-agency_cd, -site_no, -tz_cd, -X_63682_00000_cd)

discharge_clean <- raw_discharge %>%
  rename(discharge_cfs = X_00060_00000) %>% 
    select(-agency_cd, -site_no, -tz_cd, -X_00060_00000_cd)

#join data frames by datetime
#keep only rows that match 1:1 by datetime

joined_data <- inner_join(discharge_clean, turbidity_clean, by = "dateTime")

Our data are imported directly with the USGS dataRetrieval tool, which returns data frames with various non-human-readable column names. This code drops unnecessary columns (agency, site number, time zone, etc.) and renames the parameter columns with appropriate titles and units. Finally, the two data frames are joined by datetime so that we can run initial analyses and visualizations on the complete dataset.

Initial Visualizations and EDA

In [3]:
Code
#Both Parameters Plotted, discharge Y-left, Turbidity Y-right
plot1 <- ggplot(joined_data, aes(x = dateTime)) +
  geom_point(aes(y = discharge_cfs, color = "Discharge (cfs)")) +
  geom_point(aes(y = turbidity_fbu, color = "Turbidity (FBU)")) +
  scale_y_continuous(name = "Discharge (cfs)", sec.axis = sec_axis(~ ., name = "Turbidity (FBU)")) +
  scale_x_datetime(date_labels = "%Y", date_breaks = "1 year") +
  labs(title = "Discharge and Turbidity over Time", x = "Date") +
  scale_color_manual(values = c("Discharge (cfs)" = "blue", "Turbidity (FBU)" = "red")) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) #applied after theme_minimal() so the angled labels are not reset

print(plot1)

Code
#Summarize
summary(joined_data)
    dateTime                      discharge_cfs   turbidity_fbu    
 Min.   :2013-09-18 08:00:00.00   Min.   :  213   Min.   :   0.00  
 1st Qu.:2015-03-29 22:03:45.00   1st Qu.:  588   1st Qu.:   5.00  
 Median :2016-10-31 04:37:30.00   Median : 1140   Median :  12.30  
 Mean   :2016-11-01 21:18:51.89   Mean   : 1496   Mean   :  88.03  
 3rd Qu.:2018-06-13 18:26:15.00   3rd Qu.: 1930   3rd Qu.:  47.00  
 Max.   :2019-12-19 18:30:00.00   Max.   :24400   Max.   :4590.00  
Code
# Histograms for distribution, discharge and turbidity
ggplot(joined_data, aes(x = discharge_cfs)) +
  geom_histogram(binwidth = 100, fill = "blue", color = "black", alpha = 0.7) +
  labs(title = "Histogram of Discharge (cfs)", x = "Discharge (cfs)", y = "Frequency") +
  theme_minimal()

Code
ggplot(joined_data, aes(x = turbidity_fbu)) +
  geom_histogram(binwidth = 100, fill = "blue", color = "black", alpha = 0.7) +
  labs(title = "Histogram of Turbidity (fbu)", x = "Turbidity (fbu)", y = "Frequency") +
  theme_minimal()

Code
#Kolmogorov-Smirnov normality test (very high n dataset)
kstest_Q <- ks.test(joined_data$discharge_cfs, "pnorm", mean=mean(joined_data$discharge_cfs), sd=sd(joined_data$discharge_cfs))
Warning in ks.test.default(joined_data$discharge_cfs, "pnorm", mean =
mean(joined_data$discharge_cfs), : ties should not be present for the
Kolmogorov-Smirnov test
Code
kstest_turb <- ks.test(joined_data$turbidity_fbu, "pnorm", mean=mean(joined_data$turbidity_fbu), sd=sd(joined_data$turbidity_fbu))
Warning in ks.test.default(joined_data$turbidity_fbu, "pnorm", mean =
mean(joined_data$turbidity_fbu), : ties should not be present for the
Kolmogorov-Smirnov test
Code
print(kstest_Q)

    Asymptotic one-sample Kolmogorov-Smirnov test

data:  joined_data$discharge_cfs
D = 0.18581, p-value < 2.2e-16
alternative hypothesis: two-sided
Code
print(kstest_turb)

    Asymptotic one-sample Kolmogorov-Smirnov test

data:  joined_data$turbidity_fbu
D = 0.36381, p-value < 2.2e-16
alternative hypothesis: two-sided

For our initial visualizations, we plotted both parameters (discharge and turbidity) on the same plot, allowing us to visually inspect the data and make an initial judgment about whether streamflow (Q) and turbidity are correlated. To assess the distributions, we made histograms, which showed very strong right skew in both parameters, indicating that we should test for normality and keep the non-normality of our data in mind for future analyses. We chose the Kolmogorov-Smirnov test because our dataset is very large. The results confirmed the strong right skew (non-normality) we expected from the plots (p-values < 2.2e-16).
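Given that strong right skew, a log transform is one common way to pull such data toward symmetry before further analysis. The sketch below uses synthetic lognormal data (not our gauge data) and a small moment-based skewness helper, skewness_m(), defined inline rather than taken from a package.

```r
#Sketch: a log transform pulls right-skewed data toward symmetry
#skewness_m() is a simple moment-based helper defined here, not a package function
skewness_m <- function(v) mean((v - mean(v))^3) / sd(v)^3

set.seed(7)
raw <- rlnorm(5000, meanlog = 3, sdlog = 1)  #right-skewed, loosely turbidity-like

skewness_m(raw)        #strongly positive: long right tail
skewness_m(log(raw))   #near zero: roughly symmetric after the transform
```

A transform like this would not change rank-based tests such as Spearman's, but it can make visualizations and any model residuals easier to interpret.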

Stat Test


In [4]:
Code
#perform spearmans rank corr test
spearman_corr <- cor.test(joined_data$discharge_cfs, joined_data$turbidity_fbu, method = "spearman")
Warning in cor.test.default(joined_data$discharge_cfs,
joined_data$turbidity_fbu, : Cannot compute exact p-value with ties
Code
#linear regression on both parameters
lm <- lm(discharge_cfs ~ turbidity_fbu, data = joined_data)

print(spearman_corr)

    Spearman's rank correlation rho

data:  joined_data$discharge_cfs and joined_data$turbidity_fbu
S = 6.1836e+14, p-value < 2.2e-16
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho 
0.5981768 
Code
print(lm)

Call:
lm(formula = discharge_cfs ~ turbidity_fbu, data = joined_data)

Coefficients:
  (Intercept)  turbidity_fbu  
     1279.873          2.451  

For our final investigation of the data, we ran initial tests for correlation via Spearman's rank correlation (appropriate for a non-normal distribution) and fit a linear regression as well. The results indicate a clear monotonic correlation between the two parameters (ρ ≈ 0.60, p < 2.2e-16), but the relationship does not appear to be linear.

Preliminary Methods

1.) Identify your preliminary methods. Our preliminary methods, shown above, are EDA followed by initial tests with Spearman's rank correlation and linear regression to determine whether a relationship exists between our two parameters. Further into our work we will integrate time into our analyses, which will allow us to better understand how closely the two parameters are related and how tightly they are linked on a temporal scale. One idea is to test how time has affected turbidity since the dam removal, which would let us visualize and understand how sediment is being transported downstream from the reservoirs, and how that process is changing as time passes after removal.

2.) What are you thinking about using to analyze your data to answer your question? We are still settling on our statistical methods, but we are certain one of our next steps should be to determine the relationship between time and turbidity while controlling for discharge. This will let us better understand how turbidity continues to change post dam-removal.
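One minimal way to sketch that step is a multiple regression of turbidity on time with discharge as a covariate. The example below uses entirely synthetic data and an illustrative predictor name (days_since_removal); it is a sketch of the idea, not our actual analysis.

```r
#Hypothetical sketch: time trend in turbidity, controlling for discharge
#(synthetic data; a simulated decline of -0.2 FBU/day is built in)
set.seed(42)
n <- 500
days_since_removal <- 1:n
discharge <- 800 + 400 * sin(2 * pi * days_since_removal / 365) + rnorm(n, 0, 50)
turbidity <- 200 - 0.2 * days_since_removal + 0.05 * discharge + rnorm(n, 0, 10)

fit <- lm(turbidity ~ days_since_removal + discharge)
coef(summary(fit))  #the time coefficient recovers the simulated decline
```

On the real data, a negative and significant time coefficient would suggest turbidity is settling back down as the reservoir sediment pulse works through the system.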

3.) Do you have everything you need? What supplemental data might you need? Our study is very case-specific, so no additional data exist within this watershed for the dam removal period, but we could potentially integrate parallel data from other dam removals if appropriate datasets can be found.

4.) What are some potential challenges you see in the data? One issue we have found so far is that the periods of record for our discharge and turbidity gauges are relatively short on the timescale of the dam removal. Ideally we would have captured turbidity levels before, during, and after the removal so that we could also analyze changes in the geomorphic regime, but unfortunately this was not possible.

5.) What are some potential challenges you see in the methods? One challenge we have already encountered is that tied values in our data complicate the Spearman's rank correlation test: with ties present, the reported p-value is only an approximation rather than exact. If this becomes a large enough issue, we may need to use a different correlation method.
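One tie-tolerant cross-check (a sketch, using synthetic tied data rather than our gauge records) is Kendall's correlation: its tau-b statistic adjusts for ties, so cor.test(..., method = "kendall") could complement the Spearman result.

```r
#Sketch: Kendall's tau-b as a tie-tolerant complement to Spearman
set.seed(1)
x <- round(runif(200, 0, 10))      #rounding creates heavy ties
y <- x + round(rnorm(200, 0, 2))   #correlated with x, also tied

kendall <- cor.test(x, y, method = "kendall")
kendall$estimate  #tau-b, which corrects for tied ranks
kendall$p.value
```

For large n, R uses a normal approximation for Kendall's test, so the tied data do not trigger the exact-p-value complication we saw with Spearman.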

6.) How do the selected methods help you get to your final goal? Our selected methods move us toward our final goal by characterizing the relationship between discharge and turbidity in the Elwha River, specifically below the Glines Canyon Dam removal site. The analyses we have planned will let us investigate how that relationship works, how it has changed, and how time has shaped it post-dam removal.

Methods

Decomposition

In [5]:
Code
monthdata <- joined_data %>% 
  mutate(Date = yearmonth(dateTime)) %>% 
  group_by(Date) %>% 
  summarise(discharge_cfs = mean(discharge_cfs),
            turbidity_fbu = mean(turbidity_fbu))

elwha_tbl <- as_tsibble(monthdata)
Using `Date` as index variable.
Code
elwha_tsplot <- elwha_tbl %>% 
  ggplot(aes(x = Date)) +
  geom_line(aes(y = discharge_cfs, color = "Discharge")) +
  geom_line(aes(y = turbidity_fbu, color = "Turbidity")) +
  scale_y_continuous(name = "Discharge (cfs)", sec.axis = sec_axis(~ ., name = "Turbidity (FBU)")) + 
  labs(
    color = "",
    x = "Date"
  )

ggplotly(elwha_tsplot)
Code
gg_subseries(elwha_tbl, y = turbidity_fbu)+
  labs(title = "Monthly Turbity Patterns", y = "Turbidity (FBU)", x = "Year") + 
  theme_minimal()

Code
gg_subseries(elwha_tbl, y = discharge_cfs)+
  labs(title = "Monthly Flow Patterns", y = "Discharge (CFS)", x = "Year") + 
  theme_minimal()

Code
flow_decomp <- elwha_tbl %>% 
model(STL(discharge_cfs ~ season(window = 12))) %>% 
  components()

turbid_decomp <- elwha_tbl %>% 
model(STL(turbidity_fbu ~ season(window = 12))) %>% 
  components()

autoplot(flow_decomp) +
  labs(title = "STL Decomposition of Flow", y = "Cubic Feet per Second") +
  theme_minimal()

Code
autoplot(turbid_decomp) +
  labs(title = "STL Decomposition of Turbidity", y = "FBU") +
  theme_minimal()

Code
flow_lm<- lm(trend ~ Date, data = flow_decomp)
summary(flow_lm)

Call:
lm(formula = trend ~ Date, data = flow_decomp)

Residuals:
    Min      1Q  Median      3Q     Max 
-676.81 -176.43   47.15  257.47  414.53 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept) 3942.52549  855.69235   4.607 1.66e-05 ***
Date          -0.14498    0.05003  -2.898  0.00494 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 291.2 on 74 degrees of freedom
Multiple R-squared:  0.1019,    Adjusted R-squared:  0.08977 
F-statistic: 8.397 on 1 and 74 DF,  p-value: 0.004942
Code
ggplot(flow_decomp, aes(x = Date, y = trend)) +
  geom_point() + 
  geom_smooth(color = "red", method = "lm", formula = (y ~ x)) +
  theme_minimal() + 
  labs(
    x = "Date",
    y = "Discharge Trend (cfs)"
  )

Code
turbid_lm<- lm(trend ~ Date, data = turbid_decomp)
summary(turbid_lm)

Call:
lm(formula = trend ~ Date, data = turbid_decomp)

Residuals:
   Min     1Q Median     3Q    Max 
-61.49 -31.25 -18.39  39.83 115.04 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1.537e+03  1.374e+02   11.19   <2e-16 ***
Date        -8.487e-02  8.032e-03  -10.57   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 46.75 on 74 degrees of freedom
Multiple R-squared:  0.6014,    Adjusted R-squared:  0.596 
F-statistic: 111.7 on 1 and 74 DF,  p-value: < 2.2e-16
Code
ggplot(turbid_decomp, aes(x = Date, y = trend)) +
  geom_point() + 
  geom_smooth(color = "red", method = "lm", formula = (y ~ x)) +
  theme_minimal() + 
  labs(
    x = "Date",
    y = "Turbidity Trend (fbu)"
  )

Attempted Hindcasting

In [6]:
Code
#Reversing dataset to facilitate hindcasting. Basic plan is to just flip the data to forecast backwards
hindcast<- discharge_clean

#hindcast$dateTime <- rev(hindcast$dateTime) #Don't use, might not be needed
hindcast$discharge_cfs <- rev(hindcast$discharge_cfs) #reverses discharge data
#hindcast$turbidity_fbu <- rev(hindcast$turbidity_fbu) #reverses turbidity data

hindcast_daily <- hindcast %>% 
  mutate(Date = date(dateTime)) %>% 
  group_by(Date) %>% 
  summarise(Flow = mean(discharge_cfs))

#Modeling Prep
set.seed(100)
ts_tbl <-  tsibble::as_tsibble(hindcast_daily) |> 
  as_tibble()
Using `Date` as index variable.
Code
splits <- time_series_split(ts_tbl, assess = "36 months", cumulative = TRUE)
Using date_var: Date
Code
training <-  training(splits)
testing  <-  testing(splits)

mods <- list(
  prophet_boost(seasonality_daily = TRUE) |> set_engine("prophet_xgboost")
)

dis_models <- map(mods, ~ fit(.x, Flow ~ Date, data = training))


(dis_models_tbl <- as_modeltime_table(dis_models))
# Modeltime Table
# A tibble: 1 × 3
  .model_id .model   .model_desc
      <int> <list>   <chr>      
1         1 <fit[+]> PROPHET    
Code
(calibration_table <- modeltime_calibrate(dis_models_tbl, testing, quiet = FALSE))
# Modeltime Table
# A tibble: 1 × 5
  .model_id .model   .model_desc .type .calibration_data   
      <int> <list>   <chr>       <chr> <list>              
1         1 <fit[+]> PROPHET     Test  <tibble [1,095 × 4]>
Code
modeltime_accuracy(calibration_table) |> 
  arrange(mae)
# A tibble: 1 × 9
  .model_id .model_desc .type   mae  mape  mase smape  rmse    rsq
      <int> <chr>       <chr> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>
1         1 PROPHET     Test  1258.  159.  4.29  73.4 1584. 0.0827
Code
(forecast <- calibration_table  |> 
  modeltime_forecast(h = "36 months", 
                     new_data = testing,
                     actual_data = ts_tbl))
# Forecast Results
  
Conf Method: conformal_default | Conf Interval: 0.95 | Conf By ID: FALSE
(GLOBAL CONFIDENCE)
# A tibble: 3,380 × 7
   .model_id .model_desc .key   .index     .value .conf_lo .conf_hi
       <int> <chr>       <fct>  <date>      <dbl>    <dbl>    <dbl>
 1        NA ACTUAL      actual 2013-09-18   882.       NA       NA
 2        NA ACTUAL      actual 2013-09-19   641.       NA       NA
 3        NA ACTUAL      actual 2013-09-20   628.       NA       NA
 4        NA ACTUAL      actual 2013-09-21   638.       NA       NA
 5        NA ACTUAL      actual 2013-09-22   675.       NA       NA
 6        NA ACTUAL      actual 2013-09-23   753.       NA       NA
 7        NA ACTUAL      actual 2013-09-24   889.       NA       NA
 8        NA ACTUAL      actual 2013-09-25   993.       NA       NA
 9        NA ACTUAL      actual 2013-09-26   622.       NA       NA
10        NA ACTUAL      actual 2013-09-27   575.       NA       NA
# ℹ 3,370 more rows
Code
plot_modeltime_forecast(forecast)
Code
refit_tbl <- calibration_table |>
    modeltime_refit(data = ts_tbl)

refit_tbl |>
    modeltime_forecast(h = "36 months", actual_data = ts_tbl) |>
    plot_modeltime_forecast()
Code
refit_forecast <- refit_tbl |>
    modeltime_forecast(h = "36 months", actual_data = ts_tbl)

elwha_past <- readNWISuv("12045500", "00060", startDate = "2010-09-18", endDate = "2019-12-19", tz = "UTC") %>% 
  rename(Flow = X_00060_00000, Date = dateTime) %>% 
  mutate(Date = date(Date)) |>                   
  group_by(Date) |>                                   
  summarise(Flow = mean(Flow))
GET:https://nwis.waterservices.usgs.gov/nwis/iv/?site=12045500&format=waterml%2C1.1&ParameterCd=00060&startDT=2010-09-18&endDT=2019-12-19
Code
forecast2 <- rename(refit_forecast, Date = .index)
forecast2 <- rename(forecast2, Flow = .value)

rev_hindcast <- forecast2 %>%
  group_by(Date) %>% 
  summarise(Flow = Flow)

rev_hindcast$Flow <- rev(rev_hindcast$Flow)
  
rev_hindcast <- rev_hindcast %>% 
  mutate(trueDate = Date - 1096)


ggplot() +
  geom_line(data = rev_hindcast, aes(x=trueDate, y = Flow, color = "Predicted")) +
  geom_line(data = elwha_past, aes(x=Date, y = Flow, color = "Actual")) + 
  theme_minimal() + 
  labs(
    x = "Month",
    y = "Flow (Cubic Feet per Second)",
    color = ""
  )

Code
hindcast_plot <- forecast2 %>%
  select(.model_desc,Flow,Date) %>% 
  rename(model = .model_desc)

hindcast_plot$Flow <- rev(hindcast_plot$Flow)
hindcast_plot$model <- rev(hindcast_plot$model)

hindcast_merge <- hindcast_plot %>% 
  mutate(trueDate = Date - 1096) %>% 
  dplyr::filter(model == "PROPHET") %>% 
  rename(Predicted = Flow) %>% 
  mutate(Date2 = as.character(trueDate))

past_merge <- elwha_past %>% 
  rename(Actual = Flow) %>% 
  mutate(Date2 = as.character(Date))

elwha_join <- inner_join(hindcast_merge, past_merge, by = "Date2") %>% 
  select(c(Date2, Predicted, Actual)) %>% 
  rename(Date = Date2)

model_lm<- lm(Predicted ~ Actual, data = elwha_join)
summary(model_lm)

Call:
lm(formula = Predicted ~ Actual, data = elwha_join)

Residuals:
     Min       1Q   Median       3Q      Max 
-2080.85  -432.50    50.11   447.62  1434.33 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 222.37703   35.31404   6.297 4.58e-10 ***
Actual        0.22203    0.01752  12.670  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 596 on 976 degrees of freedom
Multiple R-squared:  0.1412,    Adjusted R-squared:  0.1404 
F-statistic: 160.5 on 1 and 976 DF,  p-value: < 2.2e-16
Code
elwha_join %>% 
  ggplot(aes(x = Actual, y = Predicted)) + 
  geom_point()+
  geom_abline(linetype = 1, color = "black") +
  geom_smooth(color = "red", method = "lm", formula = (y ~ x)) +
  theme_linedraw()

Bednarek, A. T. (2001). Undamming rivers: A review of the ecological impacts of dam removal. Environmental Management, 27, 803–814.
Duda, J. J., Freilich, J. E., & Schreiner, E. G. (2008). Baseline studies in the Elwha River ecosystem prior to dam removal: Introduction to the special issue. Northwest Science, 82(sp1), 1–12.
Duda, J. J., Warrick, J. A., & Magirl, C. S. (2011). Coastal and lower Elwha River, Washington, prior to dam removal—history, status, and defining characteristics. Coastal Habitats of the Elwha River, Washington—Biological and Physical Patterns and Processes Prior to Dam Removal. US Geological Survey Scientific Investigations Report, 5120, 1–26.
Graf, W. (1993). Sustaining our water resources.
Hart, D. D., Johnson, T. E., Bushaw-Newton, K. L., Horwitz, R. J., Bednarek, A. T., Charles, D. F., Kreeger, D. A., & Velinsky, D. J. (2002). Dam removal: Challenges and opportunities for ecological research and river restoration. BioScience, 52(8), 669–682.
Johnson, S. (2001). Kettle River dam removal: Impacts of sediment on downstream mussel populations. A Meeting of the Freshwater Mollusk Conservation Society.
Lehman, B., Huff, D. D., Hayes, S. A., & Lindley, S. T. (2017). Relationships between Chinook salmon swimming performance and water quality in the San Joaquin River, California. Transactions of the American Fisheries Society, 146(2), 349–358.
Mishra, P., Pandey, C. M., Singh, U., Gupta, A., Sahu, C., & Keshri, A. (2019). Descriptive statistics and normality tests for statistical data. Annals of Cardiac Anaesthesia, 22(1), 67–72.
Mussman, E. K., Zabowski, D., & Acker, S. A. (2008). Predicting secondary reservoir sediment erosion and stabilization following dam removal. Northwest Science, 82(sp1), 236–245.
Randle, T. J., Bountry, J. A., Ritchie, A., & Wille, K. (2015). Large-scale dam removal on the Elwha River, Washington, USA: Erosion of reservoir sediment. Geomorphology, 246, 709–728.
Ryan Bellmore, J., Duda, J. J., Craig, L. S., Greene, S. L., Torgersen, C. E., Collins, M. J., & Vittum, K. (2017). Status and trends of dam removal research in the United States. Wiley Interdisciplinary Reviews: Water, 4(2), e1164.
Simons, R. K., & Simons, D. B. (1991). Sediment problems associated with dam removal, Muskegon River, Michigan. Hydraulic Engineering, 680–685.